86 research outputs found

    Detecting Strong Ties Using Network Motifs

    Full text link
    Detecting strong ties among users in social and information networks is a fundamental operation that can improve performance on a multitude of personalization and ranking tasks. Strong-tie edges are often readily obtained from the social network as users often participate in multiple overlapping networks via features such as following and messaging. These networks may vary greatly in size, density and the information they carry. This setting leads to a natural strong tie detection task: given a small set of labeled strong tie edges, how well can one detect unlabeled strong ties in the remainder of the network? This task becomes particularly daunting for the Twitter network due to scant availability of pairwise relationship attribute data, and sparsity of strong tie networks such as phone contacts. Given these challenges, a natural approach is to instead use structural network features for the task, produced by {\em combining} the strong and "weak" edges. In this work, we demonstrate via experiments on Twitter data that using only such structural network features is sufficient for detecting strong ties with high precision. These structural network features are obtained from the presence and frequency of small network motifs on combined strong and weak ties. We observe that using motifs larger than triads alleviate sparsity problems that arise for smaller motifs, both due to increased combinatorial possibilities as well as benefiting strongly from searching beyond the ego network. Empirically, we observe that not all motifs are equally useful, and need to be carefully constructed from the combined edges in order to be effective for strong tie detection. Finally, we reinforce our experimental findings with providing theoretical justification that suggests why incorporating these larger sized motifs as features could lead to increased performance in planted graph models.Comment: To appear in Proceedings of WWW 2017 (Web-science track

    Fast Data in the Era of Big Data: Twitter's Real-Time Related Query Suggestion Architecture

    Full text link
    We present the architecture behind Twitter's real-time related query suggestion and spelling correction service. Although these tasks have received much attention in the web search literature, the Twitter context introduces a real-time "twist": after significant breaking news events, we aim to provide relevant results within minutes. This paper provides a case study illustrating the challenges of real-time data processing in the era of "big data". We tell the story of how our system was built twice: our first implementation was built on a typical Hadoop-based analytics stack, but was later replaced because it did not meet the latency requirements necessary to generate meaningful real-time results. The second implementation, which is the system deployed in production, is a custom in-memory processing engine specifically designed for the task. This experience taught us that the current typical usage of Hadoop as a "big data" platform, while great for experimentation, is not well suited to low-latency processing, and points the way to future work on data analytics platforms that can handle "big" as well as "fast" data

    THE PHYSIO ANATOMICAL VIEW OF KLEDAKA KAPHA

    Get PDF
    Dosha, Dathu and Mala form the elemental cause of human body. Our body fails to exist without these three chief constituents. The concept of Tridosha is unique contribution of Ayurveda. They are Vata, Pitta and Kapha. These constitute the anatomical and physiological aspect of human body in general. Kapha is the element which gives stability and endurance to the body. There are 5 types of Kapha according to the location and function. Kledaka is one among them. It is located in Amashaya where the major part of digestion occurs. It is anatomically the stomach. Its function is to moisten and disintegrate the ingested food particles. It also protects the stomach from self digestion. It supports other Kapha sthanas of the body. According to its location and function given in Ayurvedic classics, its role in digestion can be assessed. When we consider the functions; Kledaka Kapha can be correlated with the gastric mucus which is secreted by surface epithelial cells of gastric mucosal layer and cells of gastric glands. The functions of gastric mucus are to lubricate the food particle for the formation of chime and to protect the gastric wall.The paper is intended to explore the physio anatomical aspect of Kledaka Kapha and its action as gastric mucus and its importance in digestion and metabolism of food

    The impossibility of low rank representations for triangle-rich complex networks

    Full text link
    The study of complex networks is a significant development in modern science, and has enriched the social sciences, biology, physics, and computer science. Models and algorithms for such networks are pervasive in our society, and impact human behavior via social networks, search engines, and recommender systems to name a few. A widely used algorithmic technique for modeling such complex networks is to construct a low-dimensional Euclidean embedding of the vertices of the network, where proximity of vertices is interpreted as the likelihood of an edge. Contrary to the common view, we argue that such graph embeddings do not}capture salient properties of complex networks. The two properties we focus on are low degree and large clustering coefficients, which have been widely established to be empirically true for real-world networks. We mathematically prove that any embedding (that uses dot products to measure similarity) that can successfully create these two properties must have rank nearly linear in the number of vertices. Among other implications, this establishes that popular embedding techniques such as Singular Value Decomposition and node2vec fail to capture significant structural aspects of real-world complex networks. Furthermore, we empirically study a number of different embedding techniques based on dot product, and show that they all fail to capture the triangle structure

    Generalization in Graph Neural Networks: Improved PAC-Bayesian Bounds on Graph Diffusion

    Full text link
    Graph neural networks are widely used tools for graph prediction tasks. Motivated by their empirical performance, prior works have developed generalization bounds for graph neural networks, which scale with graph structures in terms of the maximum degree. In this paper, we present generalization bounds that instead scale with the largest singular value of the graph neural network's feature diffusion matrix. These bounds are numerically much smaller than prior bounds for real-world graphs. We also construct a lower bound of the generalization gap that matches our upper bound asymptotically. To achieve these results, we analyze a unified model that includes prior works' settings (i.e., convolutional and message-passing networks) and new settings (i.e., graph isomorphism networks). Our key idea is to measure the stability of graph neural networks against noise perturbations using Hessians. Empirically, we find that Hessian-based measurements correlate with the observed generalization gaps of graph neural networks accurately. Optimizing noise stability properties for fine-tuning pretrained graph neural networks also improves test performance on several graph-level classification tasks.Comment: 36 pages, 2 tables, 3 figures. Appeared in AISTATS 202

    An Experimental Study of Structural Diversity in Social Networks

    Full text link
    Several recent studies of online social networking platforms have found that adoption rates and engagement levels are positively correlated with structural diversity, the degree of heterogeneity among an individual's contacts as measured by network ties. One common theory for this observation is that structural diversity increases utility, in part because there is value to interacting with people from different network components on the same platform. While compelling, evidence for this causal theory comes from observational studies, making it difficult to rule out non-causal explanations. We investigate the role of structural diversity on retention by conducting a large-scale randomized controlled study on the Twitter platform. We first show that structural diversity correlates with user retention on Twitter, corroborating results from past observational studies. We then exogenously vary structural diversity by altering the set of network recommendations new users see when joining the platform; we confirm that this design induces the desired changes to network topology. We find, however, that low, medium, and high structural diversity treatment groups in our experiment have comparable retention rates. Thus, at least in this case, the observed correlation between structural diversity and retention does not appear to result from a causal relationship, challenging theories based on past observational studies.Comment: To appear in the Proceedings of International AAAI Conference on Web and Social Media (ICWSM 2020

    Reducing Uncertainty in Sea-level Rise Prediction: A Spatial-variability-aware Approach

    Full text link
    Given multi-model ensemble climate projections, the goal is to accurately and reliably predict future sea-level rise while lowering the uncertainty. This problem is important because sea-level rise affects millions of people in coastal communities and beyond due to climate change's impacts on polar ice sheets and the ocean. This problem is challenging due to spatial variability and unknowns such as possible tipping points (e.g., collapse of Greenland or West Antarctic ice-shelf), climate feedback loops (e.g., clouds, permafrost thawing), future policy decisions, and human actions. Most existing climate modeling approaches use the same set of weights globally, during either regression or deep learning to combine different climate projections. Such approaches are inadequate when different regions require different weighting schemes for accurate and reliable sea-level rise predictions. This paper proposes a zonal regression model which addresses spatial variability and model inter-dependency. Experimental results show more reliable predictions using the weights learned via this approach on a regional scale.Comment: 6 pages, 5 figures, I-GUIDE 2023 conferenc

    FinderNet: A Data Augmentation Free Canonicalization aided Loop Detection and Closure technique for Point clouds in 6-DOF separation

    Full text link
    We focus on the problem of LiDAR point cloud based loop detection (or Finding) and closure (LDC) in a multi-agent setting. State-of-the-art (SOTA) techniques directly generate learned embeddings of a given point cloud, require large data transfers, and are not robust to wide variations in 6 Degrees-of-Freedom (DOF) viewpoint. Moreover, absence of strong priors in an unstructured point cloud leads to highly inaccurate LDC. In this original approach, we propose independent roll and pitch canonicalization of the point clouds using a common dominant ground plane. Discretization of the canonicalized point cloud along the axis perpendicular to the ground plane leads to an image similar to Digital Elevation Maps (DEMs), which exposes strong spatial priors in the scene. Our experiments show that LDC based on learnt embeddings of such DEMs is not only data efficient but also significantly more robust, and generalizable than the current SOTA. We report significant performance gain in terms of Average Precision for loop detection and absolute translation/rotation error for relative pose estimation (or loop closure) on Kitti, GPR and Oxford Robot Car over multiple SOTA LDC methods. Our encoder technique allows to compress the original point cloud by over 830 times. To further test the robustness of our technique we create and opensource a custom dataset called Lidar-UrbanFly Dataset (LUF) which consists of point clouds obtained from a LiDAR mounted on a quadrotor
    corecore